Method and materials for producing deletion derivatives of polypeptides



The present invention relates to genetic engineering and especially in vitro 
transposition. The invention describes a method and materials for producing deletion 
derivatives of polypeptide coding nucleic acids. In particular, the invention provides 
means for efficient generation of C-terminal deletions of polypeptides by the use of a 
modified transposon with translation stop codons in all three reading frames. The 
invention further provides a kit for producing said deletion derivatives. 



BACKGROUND OF THE INVENTION 



Thousands of different types of protein species constitute a major molecular component 
of cellular life. These molecules are composed of amino acid chains, the sequence of 
which is encoded by the genes in the organism's DNA. Protein function can be
diverse, and specific functions have evolved for different cellular demands. Native
wild type protein molecules can obviously be studied for their function biochemically 
and genetically. The data thus obtained can be informative but very often such 
information is relatively limited. A better description of protein function can be gained 
through mutational analysis in which various types of mutations are introduced into the 
protein primary sequence and the mutated proteins are then analyzed for their function. 
With current recombinant DNA technology (Sambrook et al. 1989, Sambrook and 
Russell 2001), generation of mutations is relatively easy and therefore mutational 
analysis of proteins has become a standard in functional studies of proteins. 

In principle, three different types of mutations can be introduced into a protein
sequence: (i) substitutions, (ii) insertions, and (iii) deletions. In a substitution
mutation, a particular amino acid (or an amino acid stretch) in a protein is changed to
another (or to another amino acid stretch of the same length). In an insertion, an amino
acid or a stretch of amino acids is added to the protein, thus increasing the length of
the amino acid chain. In a deletion mutation, an amino acid or a stretch of amino acids
is eliminated from the protein sequence, and thus the protein becomes smaller in size.

Various mutagenesis methods are currently available for generation of different types of
mutations. These methods are typically straightforward to use. However, in most cases
the wanted mutations are generated one by one and, therefore, their construction is
time-consuming and labor-intensive. It would be desirable if a number of mutations could
be generated simultaneously. For certain types of insertion mutations this type of
approach has been described (Hayes and Hallet 2000). However, an efficient method for
simultaneous generation of substitution and deletion mutations is still lacking.

One of the in vitro transposition systems we utilised for the present invention was a
bacteriophage Mu-derived transposition system that has recently been introduced
(Haapa et al. 1999a) and shown to function efficiently in many types of molecular
biology applications (Wei et al. 1997, Taira et al. 1999, Haapa et al. 1999ab, Vilen et al.
2001). Mu transposition proceeds within the context of protein-DNA complexes that are
called DNA transposition complexes or transpososomes (Mizuuchi 1991, Savilahti et al.
1995). These complexes are assembled from a tetramer of MuA transposase protein and
Mu-transposon-derived DNA-end-segments (i.e. transposon end sequences recognised
by MuA) containing MuA binding sites. When the complexes are formed they can react
in a divalent metal ion-dependent manner with any target DNA and splice the Mu end
segments into the target (Savilahti et al. 1995). In the simplest case, the MuA
transposase protein and a short 50 bp Mu right-end (R-end) fragment are the only
macromolecular components required for transpososome assembly (Savilahti et al.
1995, Savilahti and Mizuuchi 1996). Analogously, when two R-end sequences are
located as inverted terminal repeats in a longer DNA molecule, transposition complexes
form by synapsing the transposon ends. Target DNA in Mu DNA transposition in vitro
can be linear, open circular, or supercoiled (Haapa et al. 1999a).

The Mu transposition complex, the machinery within which the chemical steps of
transposition take place, is initially assembled from four molecules of MuA transposase
protein that first bind specific binding sites in the transposon ends (Figs. 5A and 5B).
The 50 bp Mu right end DNA segment contains two of these binding sites (they are
called R1 and R2 and each of them is 22 bp long, Savilahti et al. 1995). When two ends,
each bound by two MuA monomers, meet, the transposition complex is formed through
conformational changes, the nature of which is not fully understood because of a lack
of atomic resolution structural data on Mu transpososomes. However, the assembly of
the minimal Mu transpososome is clearly dependent upon the correct binding of MuA
transposase to Mu ends of the donor DNA. Thus, modifications in the conserved
nucleotide sequence of transposon ends (e.g. R1 and R2 sequences in Mu R-end) should
potentially have a negative effect on the efficiency of the transposition since every
altered nucleotide conceivably interferes with the MuA binding. It has been documented
(Lee and Harshey 2001, Coros and Chaconas 2001) that the two last base pairs in the
Mu transposon end can be modified without severe effect on transpososome function.
However, no detailed analysis has been conducted for elucidation of the effects of
modified R1 and R2 binding sites. In one example (Laurent et al. 2000) a NotI
restriction site was engineered close to the transposon end that changed one base pair in
the R1 sequence. In vivo studies indicate that within the R1 and R2 sequences mutations
generally have negative effects on transposition efficiency (Groenen et al. 1985, 1986).
In addition, these effects are typically additive.

SUMMARY OF THE INVENTION

In this invention we describe a general methodology for making deletion derivatives of
polypeptides using an in vitro DNA transposition system. The method of the invention can
be used to generate a number of deletion derivatives of polypeptide coding nucleic
acids simultaneously and with ease.

We utilised modified transposons that allowed us to generate C-terminal deletion
derivatives of polypeptides. The methodology should be applicable to any protein, the
encoding nucleic acid sequence (e.g. a gene) of which is cloned in a plasmid or other
DNA vector.

In one aspect, the invention features a transposon nucleic acid comprising a genetically
engineered translation stop signal in three reading frames at least partly within a
transposon end sequence, or preferably within a transposon end binding sequence,
recognised by a transposase.

In various embodiments the transposon nucleic acid of the invention may contain a
selectable marker and/or reporter gene. In one preferable embodiment the transposon
end sequence of said transposon nucleic acid is a Mu end sequence recognised by MuA
transposase. In one particular embodiment said Mu end sequence is a Mu R-end
sequence.

In another preferred embodiment of the invention the modified transposon is a
Tn7-derived transposon.

In a second aspect, the invention provides a method for producing a deletion derivative
of a polypeptide coding nucleic acid comprising the steps of:
(a) performing a transposition reaction in the presence of the transposon nucleic acid of
the invention and a target nucleic acid containing a polypeptide coding nucleic acid of
interest,
(b) recovering a target nucleic acid having said transposon nucleic acid
incorporated in said polypeptide coding nucleic acid.

In a preferred embodiment the method of the invention further comprises a step of (c)
expressing said polypeptide coding nucleic acid having said transposon nucleic acid
incorporated.

In a third aspect, the invention provides a kit for producing deletion derivatives of
polypeptide coding nucleic acids. The kit comprises the transposon nucleic acid of the
invention.

In a fourth aspect, the invention features use of the transposon nucleic acid of the 
invention for producing deletion derivatives of polypeptide coding nucleic acids. 

The term "transposon", as used herein, refers to a nucleic acid segment, which is 
recognised by a transposase or an integrase enzyme and which is an essential component
of a functional nucleic acid-protein complex capable of transposition (i.e. a 
transpososome). 

The term "transposase" used herein refers to an enzyme, which is an essential
component of a functional nucleic acid-protein complex capable of transposition and
which mediates transposition. The term "transposase" also refers to integrases from
retrotransposons or of retroviral origin.

The expression "transposition reaction" used herein refers to a reaction wherein a
transposon inserts into a target nucleic acid. Essential components in a transposition
reaction are a transposon and a transposase or an integrase enzyme or some other
components needed to form a functional transposition complex. The method and
materials of the present invention are exemplified by employing in vitro Mu
transposition (Haapa et al. 1999ab and Savilahti et al. 1995) or the transposition system
of Tn7 (Craig, 1996). Other transposition systems can be used as well. Examples of such
systems are Ty1 (Devine and Boeke, 1994, and International Patent Application WO
95/23875), Tn10 and IS10 (Kleckner et al. 1996), Mariner transposase (Lampe et al.,
1996), Tc1 (Vos et al., 1996, 10(6), 755-61), Tn5 (Park et al., 1992), P element
(Kaufman and Rio, 1992) and Tn3 (Ichikawa and Ohtsubo, 1990), bacterial insertion
sequences (Ohtsubo and Sekine, 1996), retroviruses (Varmus and Brown 1989) and
retrotransposon of yeast (Boeke, 1989).

The term "transposon end sequence" used herein refers to the conserved nucleotide
sequences at the distal ends of a transposon. The transposon end sequences are
responsible for identifying the transposon for transposition.

The term "transposon end binding sequence" used herein refers to the conserved 
nucleotide sequences within the transposon end sequence whereto a transposase 
specifically binds when mediating transposition. 

The term "target nucleic acid" used herein refers to a nucleic acid molecule containing a 
protein coding nucleic acid of interest. 

The term "translation stop signal" used herein refers to any of the three codon
triplets of the genetic code (UAA, UAG, UGA) that terminate polypeptide chain
production during protein synthesis in a ribosome. In a DNA strand the corresponding
stop signal triplets are TAA, TAG and TGA.




WO 03/087370 PCT/FI03/00285 

The term "reading frame" used herein refers to any sequence of bases in DNA or RNA 
that codes for the synthesis of either a protein or a component polypeptide. The point of 
initiation of reading determines the frame, i.e. the way in which the bases will be 
grouped in triplets as required by the genetic code. 
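
The definitions of "translation stop signal" and "reading frame" above can be illustrated with a minimal sketch. The function name and the 11-nt test motif are hypothetical and chosen only for illustration; the motif is not asserted to be the sequence of any SEQ ID NO of this document. The sketch checks, for one DNA strand read 5'-to-3', which of the three reading frames contains a stop triplet.

```python
# DNA equivalents of the stop triplets UAA, UAG and UGA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def frames_with_stop(seq):
    """Return the set of reading frames (0, 1, 2) of seq that contain a stop triplet.

    Frame k groups the bases into triplets starting at offset k.
    Only the given strand is read; the complementary strand is ignored.
    """
    seq = seq.upper()
    frames = set()
    for frame in range(3):
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                frames.add(frame)
                break
    return frames


# Hypothetical motif: TGA at offset 0, TGA at offset 4, TAG at offset 8.
motif = "TGAGTGAGTAG"
print(sorted(frames_with_stop(motif)))     # frames 0, 1 and 2 are all closed
print(sorted(frames_with_stop("ATGGCC")))  # no stop triplet in any frame
```

Because the three stop triplets start at offsets 0, 4 and 8, they fall into three different frames, which is the property required of a stop signal "in three reading frames".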

The term "genetic engineering" used herein refers to molecular manipulation involving
the construction of artificial recombinant nucleic acid molecules.

The term "gene" used herein refers to genomic DNA or RNA that are translated into 
polypeptides. 

BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1. 

Cat-Mu transposons: Cat-Mu containing wild type Mu ends, Cat-Mu(NotI) containing
Mu ends with an engineered NotI restriction site, the design of which is described in
Laurent et al. 2000, and Cat-Mu(Stop x 3) containing Mu ends with an engineered
translation stop signal in three reading frames (SEQ ID NO:2). Transposon end sequences
(i.e. inverted terminal repeats) are drawn as rectangles.

Figure 2. 

Transposon end sequences of Cat-Mu transposons: Cat-Mu transposon containing wild
type Mu ends (SEQ ID NO:3), Cat-Mu(NotI) containing Mu ends with engineered NotI
restriction site described in Laurent et al. 2000 (SEQ ID NO:4), and Cat-Mu(Stop x 3)
containing Mu ends with engineered translation stop signal in three reading frames
(SEQ ID NO:1). Asterisks (*) show modified nucleotides in the Mu ends of
Cat-Mu(NotI) and Cat-Mu(Stop x 3).

Figure 3. 

Analysis of C-terminal deletion variants on DNA level. Plasmids bearing Cat-Mu(Stop x
3) transposon insertions (samples 1-24) were digested with BamHI, and they were
analyzed on 1.8 % agarose gels. The length of the shortest fragment of each digest
corresponds roughly to the length of the deletion variant protein gene (0 to ~650 bp). M =
DNA standards.



Figure 4. 

Analysis of C-terminal deletion variants on protein level. The sizes of the deletion
variant proteins, as predicted by sequencing analysis, are marked below each lane in
kilodaltons. M = molecular weight standard, C+ = positive control, C- = negative control.
Predicted deletion variant protein products are pointed out by arrows.

Figures 5A and 5B.

5A, Mu transposition complex. 5B, the assembly of Mu transposition complex. 

Figure 6. 

Overall strategy for production of C-terminal deletion variants of genes encoding
proteins.

DETAILED DESCRIPTION OF THE INVENTION 

It has been published previously that protein engineering applications will benefit from
Mu-based transposon strategies since it was established that any DNA sandwiched
between Mu ends could be utilised as an artificial transposon (Haapa et al. 1999a). In
principle, insertion mutations (e.g. by addition of epitope tags or protein domains) and
deletion mutations (by addition of translation stop codons) were foreseen with this
strategy. However, introduction of a translation stop codon between transposon ends
would leave a number of encoded amino acid residues at the protein's C-terminus.
Given that an effective Mu end is about 50 bp in length, minimally this strategy would
leave approximately 18 extra amino acids attached to the protein C-terminus. Extra
amino acids may interfere with the protein function; therefore it would be better to add
the stop codons as close as possible to the transposon end. By modifying the nucleotides
of the Mu R-end (a total of 7 nucleotides were changed, 5 of which reside in the
Mu R1 sequence), we managed to place three stop codons in three reading frames very
close to the Mu R-end, resulting in transposons that, surprisingly, still retained their
ability to form transposition complexes that were competent for transposition chemistry,
i.e. they facilitated the integration of the transposon in vitro into a target plasmid. In
essence, all the possible C-terminal deletion variants can be generated.
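
The effect of placing the stop signals at the very end of the transposon can be illustrated with a minimal sketch. The 11-nt end motif, the placeholder transposon body and the short test gene below are hypothetical and chosen only for illustration; they are not the sequences of SEQ ID NO:1 or SEQ ID NO:2. The sketch models a plain insertion read on one strand and ignores target site duplication.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate dna from its first base until the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def insert(target, transposon, pos):
    """Return target with transposon spliced in before position pos (0-based)."""
    return target[:pos] + transposon + target[pos:]


STOP_END = "TGAGTGAGTAG"             # hypothetical end motif, stops in 3 frames
TRANSPOSON = STOP_END + "CCCGGGCCC"  # placeholder for the rest of the element
GENE = "ATGGCTAAAGAACTGCGTTGGTAA"    # hypothetical gene encoding MAKELRW

for pos in (9, 10, 11):  # one insertion site per reading-frame phase
    truncated = translate(insert(GENE, TRANSPOSON, pos))
    extra = len(truncated) - pos // 3
    print(pos, truncated, "extra residues:", extra)
```

In this toy case the truncated products carry at most three residues that are not encoded by the complete target codons upstream of the insertion site, whichever phase the insertion falls in. A stop codon placed only beyond an unmodified transposon end of about 50 bp would instead add the approximately 18 residues mentioned above.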

We designed an artificial Cat-Mu(Stop)-transposon (SEQ ID NO:2) conferring
resistance to chloramphenicol and Tn7-Kan(Stop)-transposon (SEQ ID NO:7)
conferring resistance to kanamycin. Both contained in their ends modified base pairs
providing three stop codons in three reading frames (figs. 1 and 2). The gene mediating
resistance to chloramphenicol is used as a selectable marker. The term "selectable
marker" refers to a gene that, when carried by a transposon, alters the ability of a cell
harboring the transposon to grow or survive in a given growth environment relative to a
similar cell lacking the selectable marker. The transposon nucleic acid of the invention
preferably contains a positive selectable marker. A positive selectable marker, such as
an antibiotic resistance, encodes a product that enables the host to grow and survive in
the presence of an agent, which otherwise would inhibit the growth of the organism or
kill it. The transposon nucleic acid of the invention may also contain a reporter gene,
which can be any gene encoding a product whose expression is detectable and/or
quantifiable by immunological, chemical, biochemical, biological or mechanical
assays. A reporter gene product may, for example, have one of the following attributes:
fluorescence (e.g., green fluorescent protein), enzymatic activity (e.g., luciferase,
lacZ/β-galactosidase), toxicity (e.g., ricin) or an ability to be specifically bound by a
second molecule (e.g., biotin). The use of markers and reporter genes in prokaryotic and
eukaryotic cells is well-known in the art. In a preferred embodiment the transposon
nucleic acid of the invention may also contain genetically engineered restriction enzyme
sites. For example, the selectable marker gene within the transposon of the invention
may influence the protein expression when a construct obtained by the method of the
invention is inserted into a protein expression plasmid. It is therefore desirable to
engineer a pair of unique restriction sites to flank the selectable marker gene. The
marker can then be removed easily by the use of these sites and thus the final expression
construct would not contain the marker gene.

Hence, one embodiment of the invention provides a transposon nucleic acid comprising
a genetically engineered translation stop signal in three reading frames at least partly
within a transposon end sequence, or preferably within a transposon end binding
sequence, recognised by a transposase (i.e. at least one conserved nucleotide of the end
sequence has been modified, preferably two, three, four or more conserved nucleotides
have been modified). Preferably, the transposon nucleic acid of the invention comprises
a Mu or Tn7 transposon sequence. More preferably the transposon nucleic acid of the
invention comprises a Mu R-end sequence, e.g., the sequence of SEQ ID NO:1 or SEQ
ID NO:5 (Mu R-end sequence not including 5' overhang, which thus can vary). In a
transposon end sequence of the transposon nucleic acid of the invention, translation stop
signals of three reading frames are in 5'-to-3' direction, preferably in succession close
to each other at the very end of a transposon, so that the three stop signals are as near as
possible to the flanking sequence after the transposon is incorporated into a target.

Furthermore, the transposon end sequences, which participate in the assembly of the
transpososome discussed above, can be different from each other or they can be in
different nucleic acid molecules. Preferably, both transposon end sequences
participating in the transpososome have similar sequences (i.e. they are located as
inverted terminal repeats).

The transposon nucleic acid of the invention is exemplified here by transposons of the Mu
(Examples 1-3) or Tn7 (Example 4) system. However, a person skilled in the art
understands that teachings of this invention can be utilised in other transposon systems
as well.

Another embodiment of the invention is a method for producing a deletion derivative of
a polypeptide coding nucleic acid comprising the steps of:
(a) performing a transposition reaction in the presence of a target nucleic acid
containing a polypeptide coding nucleic acid (e.g. a gene) of interest and in the presence
of a transposon containing a genetically engineered translation stop signal sequence in
three reading frames at least partly within a transposon end sequence recognised by a
transposase,
(b) recovering a target nucleic acid having said transposon incorporated in
said gene.

The transposition reaction (a) includes a transposon in the form of a linear DNA molecule,
transposase protein (e.g. MuA), and a target DNA as macromolecular components.
Additionally, the transposition reaction contains suitable buffer components including
Mg2+ ions critical for chemical catalysis. Buffer components such as glycerol and
DMSO (or related chemicals or solvents) somewhat relax the requirements for
transposition reaction (Savilahti et al. 1995). Transposon DNA, in principle, can be of
any length provided that each end contains a transposon (e.g. Mu or Tn7) end sequence.
Typically, target DNA is in the form of a circular plasmid. However, any double-stranded
DNA molecule of more than 25 bp is expected to serve as an efficient target molecule
(Savilahti et al. 1995, Haapa-Paananen et al. 2002). In the transposition reaction the
reaction components are incubated together; during the incubation transposition
complexes first form and then react with target DNA, splicing the transposon DNA into
target DNA. This process yields transposon integrations into target molecules. The
stoichiometry of the reaction (excess target) generates target molecules each with a
single integrated transposon. Most importantly, the integration site in each molecule can
be different. Even though some sites in DNA are somewhat more preferred than others,
most of the phosphodiester bonds in DNA will be targeted (Haapa et al. 1999ab,
Haapa-Paananen et al. 2002). In practice this means that the integration sites are selected
essentially randomly.

In the Examples below deletion mutant libraries were planned to cover the gene of
interest at least 10-fold, i.e. when the target gene was approximately 600 bp, the final
pool should contain a minimum of 6000 mutants. As a test protein we utilised the 23 kDa
yeast Mso1 protein (Aalto et al. 1997). Those skilled in the art can easily design
different strategies for mutant library construction as such strategies are well-known in
the art (see, e.g., Sambrook et al. 1989, Sambrook and Russell 2001).
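
The library size rule above can be written out as a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the coverage estimate assumes that every insertion is independent and that all sites are hit with equal probability, which is a simplification of the "essentially random" site selection described above.

```python
def library_size(gene_length_bp, fold_coverage):
    """Minimum number of independent mutants for the planned coverage."""
    return gene_length_bp * fold_coverage


def expected_fraction_of_sites_hit(n_sites, n_clones):
    """Expected fraction of sites carrying at least one insertion.

    Assumes independent insertions distributed uniformly over n_sites.
    """
    return 1.0 - (1.0 - 1.0 / n_sites) ** n_clones


clones = library_size(600, 10)  # the 600 bp, 10-fold case from the text
print(clones)
print(f"{expected_fraction_of_sites_hit(600, clones):.6f}")
```

Under these assumptions a 10-fold library is expected to leave only a very small fraction of positions without an insertion. Real libraries deviate from this estimate to the extent that some sites are preferred over others.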

A mutant library was produced as described in Example 2. Target nucleic acids with a
transposon insertion were isolated by size-selective preparative agarose gel
electrophoresis. A person skilled in the art may design different isolation methods as
such methods are well-known in the art (see, for example, Current Protocols in
Molecular Biology, eds. Ausubel et al., John Wiley & Sons: 1992). We screened
individual deletion mutants by restriction analysis (fig. 3). This analysis demonstrates
that in the library, there are variants of different sizes. A person skilled in the art can
easily utilise different screening techniques. The screening step can be performed, e.g.,
by methods involving sequence analysis, nucleic acid hybridisation, primer extension or
antibody binding. These methods are well-known in the art (see, for example, Current
Protocols in Molecular Biology, eds. Ausubel et al., John Wiley & Sons: 1992).

We sequenced 23 C-terminal mutants derived from Example 2. All the mutants carried
the translation stop codons in three reading frames.

Finally, the protein expression analysis (fig. 4) demonstrated that different deletion
variant proteins are produced. Probably due to lack of resolution in the utilised gel
system, the supposedly expressed protein was not detectable when the deletion
derivative was 8 kDa or smaller. Alternatively, very small versions of the Mso1 protein
may be proteolytically degraded inside the cells.
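
The relation between the insertion position and the predicted size of the deletion variant can be sketched as follows. The average residue mass of 110 Da is a common rule of thumb and is an assumption of this sketch, as are the function names; any tag residues and any extra residues contributed by the transposon end are ignored. The 633 bp gene length is taken from the Materials and Methods, and it is assumed here to include the stop codon.

```python
import math

AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, an assumption


def truncated_mass_kda(coding_bp_before_insertion):
    """Approximate mass of the product encoded upstream of the insertion."""
    residues = coding_bp_before_insertion // 3
    return residues * AVG_RESIDUE_MASS_DA / 1000.0


def min_coding_bp_for_mass(mass_kda):
    """Shortest coding length (bp) whose product reaches mass_kda."""
    residues = math.ceil(mass_kda * 1000.0 / AVG_RESIDUE_MASS_DA)
    return residues * 3


# 633 bp minus an assumed 3 bp stop codon gives 210 codons.
print(truncated_mass_kda(630))    # close to the stated 23 kDa
print(min_coding_bp_for_mass(8))  # coding length giving an ~8 kDa product
```

With these approximations, insertions within roughly the first 220 bp of the coding sequence would give products at or below the 8 kDa size that was not detected in the gel system used.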

A further embodiment of the invention is a kit providing means for producing deletion
derivatives of protein coding nucleic acid sequences. The kit comprises the transposon
nucleic acid of the invention. The kit can be packaged in a suitable container and
preferably it contains instructions for using the kit.

The results of the invention show that, unexpectedly, it is possible to substantially
modify conserved sequences of transposon ends without critically compromising the
competence of the modified transposon to assemble transposition complexes and
thereafter carry out transposition chemistry. Thus, the invention provides a
straightforward solution to the problem of extra amino acids attached to the protein
C-terminus of the deletion derivative which could be produced by a conventional
transposition system, wherein the transposon used contains the translation stop signals
between the transposon ends.

The present invention is further described in the following examples, which are not 
intended to limit the scope of the invention. 

EXAMPLES

EXAMPLE 1

In vitro transposition reaction 

In vitro transposition reaction (25 µl) contained 720 ng Cat-Mu(Stop) transposon as a
donor, 500 ng plasmid pHis6-MSO1 as a target nucleic acid, 0.2 µg MuA, 25 mM
Tris-HCl at pH 8.0, 100 µg/ml BSA, 15% (w/v) glycerol, 0.05% (w/v) Triton X-100, 126
mM NaCl and 10 mM MgCl2. The reaction was carried out at 30°C for 4 h.

Further details and variables of in vitro Mu transposition are described in Haapa et al. 
1999ab and Savilahti et al. 1995, incorporated herein by reference. 
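
For planning such a reaction it can be useful to convert the DNA mass of the donor into a molar amount. The sketch below is illustrative only: the function names are hypothetical, and the average mass of 650 g/mol per base pair of double-stranded DNA is a commonly used approximation that is assumed here. The transposon length of 1254 bp is taken from the Materials and Methods.

```python
AVG_BP_MASS_G_PER_MOL = 650.0  # approximation for dsDNA, an assumption


def dsdna_pmol(mass_ng, length_bp):
    """Convert a mass of linear dsDNA (ng) into picomoles of molecules."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS_G_PER_MOL)


def concentration_nm(pmol, volume_ul):
    """Concentration in nM of pmol molecules in a volume given in microlitres."""
    return pmol / volume_ul * 1000.0


donor_pmol = dsdna_pmol(720, 1254)  # 720 ng of the 1254 bp transposon
print(f"{donor_pmol:.3f} pmol")
print(f"{concentration_nm(donor_pmol, 25):.1f} nM in a 25 µl reaction")
```

The same conversion can be applied to the target plasmid once its total length is known; that length is not stated here, so it is left out of the sketch.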

EXAMPLE 2 

Generation of a pool of mutants with C-terminal deletions in Mso1

In vitro transposition reactions with Stop-Mu were performed essentially as described in
Haapa et al. (1999a) with the exception that they contained 720 ng donor DNA (Stop-Mu
x 3) and 0.88 µg MuA. Ten reactions were pooled, phenol and chloroform
extracted, ethanol precipitated, and resuspended in 30 µl of water. Several 1 µl aliquots
were electrotransformed, each into 25 µl of DH5α electrocompetent cells, as described
(Haapa et al. 1999a). Transposon-containing plasmid clones were selected on LB plates
containing Ap and Cm. A total of ~6 x 10^5 colonies were pooled and grown in selective
LB-Ap-Cm medium at 37 °C for 3 h after which plasmid DNA was prepared from the
pool with Qiagen Plasmid Midi kit. This plasmid preparation was subjected to a
XhoI-HindIII double digestion and preparative agarose gel electrophoresis. The DNA
fragment corresponding to transposon insertions into the Mso1-containing DNA
fragment was isolated with QIAquick Gel Extraction Kit (Qiagen). This fragment was
then ligated into the plasmid pHis6-MSO1 vector XhoI-HindIII backbone to generate a
construct pool with transposon insertions located only within the Mso1 gene. After
ligation, a pool of plasmids from ~5 x 10^4 colonies was prepared as described above.
Approximately 110 000 colonies were pooled. Transposon-carrying Mso1 fragments
were cloned into clean vector backbone as described above and approximately 11 000
colonies were pooled in the final C-terminal deletion mutant library. At all stages, the
transformants were selected with Ap and Cm.

EXAMPLE 3 

Restriction and expression analysis of deletion mutants 

Mutant clones were analyzed for deletions by BamHI digestion and DNA sequencing.
For protein expression analysis, single mutant plasmids were introduced into the
BL21(DE3) expression strain. Selective medium was inoculated with an overnight culture
of bacteria containing mutant plasmid and grown until OD600 was 0.4-0.7. Protein
expression was induced with 1 mM IPTG for 3 hours and samples were withdrawn for
SDS-PAGE analysis. Bacterial lysates were run on 15 % gels and stained with GelCode
blue stain (Pierce) as recommended by the supplier.

EXAMPLE 4

Generation of deletion mutants with Tn7-Kan (Stop) transposon 

In vitro Tn7 transposition reaction (20 µl) contained 40 ng Tn7-Kan (Stop) transposon
(SEQ ID NO:7) as a donor, 100 ng plasmid pUC19 as a target nucleic acid, 7 ng TnsA
protein, 10 ng TnsB protein, 20 ng TnsC* protein, 25 mM Tris-HCl at pH 8.0, 50 µg/ml
BSA, 2 mM DTT and 2 mM ATP. The reaction mixture was pre-incubated at 37°C for
10 min before addition of 30 mM magnesium acetate. After the addition the reaction
was carried out at 37°C for 1 h.

The reaction mixture was precipitated with n-butanol to reduce the ionic strength and to
concentrate DNA prior to electroporation (Thomas, 1994) and resuspended in 10 µl of
water. A 5 µl aliquot was electrotransformed into 50 µl of DH10B (Epicentre
Technologies) electrocompetent cells. Transposon-containing plasmid clones were
selected on LB plates containing kanamycin (20 µg/ml). Approximately 20000
kanamycin resistant colonies were recovered per 1 µg target DNA. Three clones were
picked from the transformation plates and grown in LB-Kn medium at 37°C overnight
after which plasmid DNA was prepared from the cultures with QiaPrep Spin Miniprep
Kit. The Tn7-Kan (Stop) transposon insertion sites were analyzed by DNA sequencing.
All the mutants carried the translation stop codons in six reading frames and in each
case, the integrated transposon was flanked by a 5-bp target site duplication generated
in TnsABC*-mediated transposition.
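
The 5-bp target site duplication observed in the sequenced clones can be pictured with a minimal sketch. The target and transposon sequences below are hypothetical placeholders (the element is given inverted repeat ends for illustration only), and the two function names are likewise hypothetical; the sketch reproduces only the bookkeeping of the duplication, not the transposition chemistry.

```python
def integrate_with_tsd(target, transposon, pos, tsd_len=5):
    """Insert transposon so that target[pos:pos + tsd_len] flanks it on both sides."""
    return target[:pos + tsd_len] + transposon + target[pos:]


def find_tsd(product, transposon, tsd_len=5):
    """Return the duplicated flanking sequence, or None if the flanks differ."""
    start = product.find(transposon)
    if start < tsd_len:
        return None  # transposon absent or too close to the left edge
    end = start + len(transposon)
    left = product[start - tsd_len:start]
    right = product[end:end + tsd_len]
    if len(right) == tsd_len and left == right:
        return left
    return None


TARGET = "AACCGGTTACGTAGCTAGGATCC"                       # hypothetical target
TRANSPOSON = "TGAGTGAGTAG" + "GGGCCC" + "CTACTCACTCA"    # hypothetical element

integrated = integrate_with_tsd(TARGET, TRANSPOSON, 6)
print(integrated)
print(find_tsd(integrated, TRANSPOSON))
```

In sequencing reads that run outward from each transposon end, the same 5 bases are therefore expected immediately next to both ends of the element, which is how the duplication is recognised in practice.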

MATERIALS AND METHODS

Bacteria, media, enzymes and reagents 

Bacterial cultures were grown in Luria broth supplemented with appropriate antibiotics:
ampicillin (Ap) at 100 µg/ml, chloramphenicol (Cm) at 10 µg/ml and kanamycin (Kn)
at 20 µg/ml when required. Escherichia coli strains were DH5α (Life Technologies),
BL21(DE3) (Novagen), and DH10B (Epicentre Technologies). MuA protein was
purified in collaboration with Finnzymes (Espoo, Finland) essentially as described
(Baker et al. 1993, Haapa et al. 1999a). TnsA, TnsB and TnsC* proteins were
purchased from New England Biolabs. Restriction enzymes and T4 DNA ligase were
from New England Biolabs and Promega, Triton X-100 from Roche. Standard DNA
techniques were performed as described (Sambrook and Russell 2001). Enzymes were
used as recommended by suppliers. Sequencing was carried out at the sequencing
service unit of the Institute of Biotechnology, University of Helsinki.

Plasmids and transposons

Plasmid pHis6-MSO1 contains the 633 bp Mso1 gene as an insert (Aalto et al. 1997).
The Cat-Mu(Stop) transposon (1254 bp) is a derivative of the Cat-Mu transposon
(Haapa et al. 1999a), and both encode resistance to chloramphenicol (figs. 1 and 2). The
Cat-Mu(Stop)-transposon ends were engineered to carry translation stop signals for
both 5'-to-3' directions of dsDNA in all three reading frames. The Tn7-Kan (Stop)
transposon is a derivative of the pGPS1.1 transposon (New England Biolabs) and it
encodes resistance to kanamycin. The Tn7-Kan (Stop) transposon ends were engineered
to carry translation stop signals for both 5'-to-3' directions of dsDNA in all three reading
frames. The Tn7-Kan (Stop) transposon sequence is 4814 bp in length (SEQ ID NO:7) and
nucleotides 3093-4791 set forth in SEQ ID NO:7 constitute the transposable element.
Modified nucleotides were at the positions of 3095, 3097, 3099, 3101, 3103, 4781,
4783, 4785, 4787, and 4789 set forth in SEQ ID NO:7.
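
The coordinates given above can be checked with a few lines of arithmetic. The sketch assumes that the positions are 1-based and inclusive, which is the usual convention for sequence listings but is not stated explicitly here; the variable and function names are hypothetical.

```python
ELEMENT_START, ELEMENT_END = 3093, 4791  # transposable element in SEQ ID NO:7
MODIFIED = [3095, 3097, 3099, 3101, 3103, 4781, 4783, 4785, 4787, 4789]

# Length of the transposable element, counting both end positions.
element_length = ELEMENT_END - ELEMENT_START + 1


def offsets_from_ends(positions, start, end):
    """Split positions by nearest end and return their distances from that end."""
    left = sorted(p - start for p in positions if p - start <= end - p)
    right = sorted(end - p for p in positions if p - start > end - p)
    return left, right


left_offsets, right_offsets = offsets_from_ends(MODIFIED, ELEMENT_START, ELEMENT_END)
print(element_length)
print(left_offsets, right_offsets)
```

Under the stated assumption, the modified nucleotides lie at every second position within ten nucleotides of each end, and the two sets of distances mirror each other, as would be expected for changes introduced into both inverted terminal repeats.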

Tn7-Kan (Stop) transposon was constructed from PCR-amplified fragments. The
transposable fragment was amplified with primers 5' acg gtg agt gag tag aaa ata gtt ggg
aac tgg ga 3' (SEQ ID NO:8) and 5' cgt atg agt gag tag aat aaa gtc tta aac tga aca aaa
tag a 3' (SEQ ID NO:9) using the plasmid pGPS1.1 as template DNA (New England
Biolabs) and the vector fragment was amplified with primers 5' aag tag ctt ttc tgt gac tgg
1 3' (SEQ ID NO:10) and 5' gat ggc atg aca gta aga gct 3' (SEQ ID NO:11) using the
plasmid pGPS1.1 (New England Biolabs) as template DNA.

Sequencing was performed using the primer 5'-gct agt tat tgc tca gcg g-3' (SEQ ID
NO:5). Sequencing of Tn7-Kan (Stop) transposon insertion sites in pUC19 plasmid was
carried out using Model 4200 DNA Sequencer (LI-COR). Sequencing was performed
using IRD700-labeled primers 5' agc tgg cga aag ggg gat gtg 3' (SEQ ID NO:12) and 5'
tta tgc ttc cgg ctc gta tgt tgt gt 3' (SEQ ID NO:13).

1. A transposon nucleic acid comprising a genetically engineered translation stop signal 
in three reading frames at least partly within a transposon end sequence recognised by a 
transposase. 

2. The transposon nucleic acid according to claim 1, wherein said transposon contains a 
selectable marker and/or a reporter gene. 

3. The transposon nucleic acid according to claim 1 or 2, wherein said transposon end 
sequence is Mu or Tn7 end sequence. 

4. The transposon nucleic acid according to any one of claims 1-3, wherein said 
transposon end sequence is a transposon end binding sequence. 

5. The transposon nucleic acid according to claim 3, wherein Mu end sequence is Mu 
R-end binding sequence. 

6. The transposon nucleic acid according to claim 5, wherein said transposon sequence 
is set forth in SEQ ID NO:1, SEQ ID NO:2 or SEQ ID NO:5. 

7. The transposon nucleic acid according to claim 3, wherein said transposon sequence 
is set forth in SEQ ID NO:7. 

8. The transposon nucleic acid according to any one of the preceding claims, wherein 
said transposon further contains a genetically engineered restriction enzyme site. 

9. Method of producing a deletion derivative of a polypeptide coding nucleic acid 
comprising the steps of: 

(a) performing a transposition reaction in the presence of a target nucleic acid 
containing a polypeptide coding nucleic acid of interest and in the presence of a 
transposon containing a genetically engineered translation stop signal sequence in three 
reading frames at least partly within a transposon end sequence recognised by a 
transposase, 

(b) recovering a target nucleic acid having said transposon incorporated in 
said protein coding nucleic acid. 

10. The method according to claim 9 further comprising a step of (c) expressing said 
protein coding nucleic acid having said transposon incorporated. 

11. The method according to claim 9 or 10, wherein said transposon comprises the 
transposon nucleic acid of any one of claims 2-8. 

12. A kit for producing deletion derivatives of polypeptide coding nucleic acids 
comprising the transposon nucleic acid of any one of claims 1-8. 

13. Use of the transposon nucleic acid of any one of claims 1-8 for producing deletion 
derivatives of polypeptide coding nucleic acids. 

SEQUENCE LISTING 



<110> Finnzymes Oy 

<120> Method and materials for producing deletion derivatives 
of proteins 

<130> STOP-MU 

<140> 
<141> 

<160> 13 

<170> PatentIn Ver. 2.1 

<210> 1 
<211> 54 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Modified Mu 
end sequence 

<400> 1 

gatctgattg attgaacgaa aaacgcgaaa gcgtttcacg ataaatgcga aaac 54 



<210> 2 
<211> 1254 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Modified Mu 
transposon 

<400> 2 

gatctgattg attgaacgaa aaacgcgaaa gcgtttcacg ataaatgcga aaacggatcc 60 
tatcgtcaat tattacctcc acggggagag cctgagcaaa ctggcctcag gcatttgaga 120 
agcacacggt cacactgctt ccggtagtca ataaaccggt aaaccagcaa tagacataag 180 
cggctattta acgaccctgc cctgaaccga cgaccgggtc gaatttgctt tcgaatttct 240 
gccattcatc cgcttattat cacttattca ggcgtagcaa ccaggcgttt aagggcacca 300 
ataactgcct taaaaaaatt acgccccgcc ctgccactca tcgcagtact gttgtaattc 360 
attaagcatt ctgccgacat ggaagccatc acaaacggca tgatgaacct gaatcgccag 420 
cggcatcagc accttgtcgc cttgcgtata atatttgccc atggtgaaaa cgggggcgaa 480 
gaagttgtcc atattggcca cgtttaaatc aaaactggtg aaactcaccc agggattggc 540 
tgagacgaaa aacatattct caataaaccc tttagggaaa taggccaggt tttcaccgta 600 
acacgccaca tcttgcgaat atatgtgtag aaactgccgg aaatcgtcgt ggtattcact 660 
ccagagcgat gaaaacgttt cagtttgctc atggaaaacg gtgtaacaag ggtgaacact 720 
atcccatatc accagctcac cgtctttcat tgccatacgt aattccggat gagcattcat 780 
caggcgggca agaatgtgaa taaaggccgg ataaaacttg tgcttatttt tctttacggt 840 
ctttaaaaag gccgtaatat ccagctgaac ggtctggtta taggtacatt gagcaactga 900 
ctgaaatgcc tcaaaatgtt ctttacgatg ccattgggat atatcaacgg tggtatatcc 960 
agtgattttt ttctccattt tagcttcctt agctcctgaa aatctcgaca actcaaaaaa 1020 
tacgcccggt agtgatctta tttcattatg gtgaaagttg gaacctctta cgtgccgatc 1080 
aacgtctcat tttcgccaaa agttggccca gggcttcccg gtatcaacag ggacaccagg 1140 
atttatttat tctgcgaagt gatcttccgt cacaggtatt tattcggtcg aaaaggatcc 1200 
gttttcgcat ttatcgtgaa acgctttcgc gtttttcgtt caatcaatca gatc 1254 



<210> 3 
<211> 54 
<212> DNA 

<213> Bacteriophage Mu 
<400> 3 

gatctgaagc ggcgcacgaa aaacgcgaaa gcgtttcacg ataaatgcga aaac 54 



<210> 4 
<211> 54 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Modified Mu 
end sequence 

<400> 4 

gatctgcggc cgcgcacgaa aaacgcgaaa gcgtttcacg ataaatgcga aaac 54 



<210> 5 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Modified Mu 
end sequence without 5' overhang 

<400> 5 

tgattgattg aacgaaaaac gcgaaagcgt ttcacgataa atgcgaaaac 50 



<210> 6 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Sequencing 
primer 

<400> 6 

gctagttatt gctcagcgg 19 



<210> 7 
<211> 4814 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Modified Tn7 
transposon 

<400> 7 

ggtaccctgt gaatgcgcaa accaaccctt ggcagaacat atccatcgcg tccgccatct 60 
ccagcagccg cacgcggcgc atctcgggca gcgttgggtc ctggccacgg gtgcgcatga 120 
tcgtgctcct gtcgttgagg acccggctag gctggcgggg ttgccttact ggttagcaga 180 
atgaatcacc gatacgcgag cgaacgtgaa gcgactgctg ctgcaaaacg tctgcgacct 240 
gagcaacaac atgaatggtc ttcggtttcc gtgtttcgta aagtctggaa acgcggaagt 300 
cagcgccctg caccattatg ttccggatct atgtcgggtg cggagaaaga ggtaatgaaa 360 
tggcagatcc ctggcttgtt gtccacaacc gttaaacctt aaaagcttta aaagccttat 420 
atattctttt ttttcttata aaacttaaaa ccttagaggc tatttaagtt gctgatttat 480 
attaatttta ttgttcaaac atgagagctt agtacgtgaa acatgagagc ttagtacgtt 540 
agccatgaga gcttagtacg ttagccatga gggtttagtt cgttaaacat gagagcttag 600 
tacgttaaac atgagagctt agtacgtgaa acatgagagc ttagtacgta ctatcaacag 660 
gttgaactgc tgatcttcgg atctatgtcg ggtgcggaga aagaggtaat gaaatggcag 720 
atccctggct tgttgtccac aaccgttaaa ccttaaaagc tttaaaagcc ttatatattc 780 
ttttttttct tataaaactt aaaaccttag aggctattta agttgctgat ttatattaat 840 
tttattgttc aaacatgaga gcttagtacg tgaaacatga gagcttagta cgttagccat 900 
gagagcttag tacgttagcc atgagggttt agttcgttaa acatgagagc ttagtacgtt 960 
aaacatgaga gcttagtacg tgaaacatga gagcttagta cgtactatca acaggttgaa 1020 
ctgctgatct tcggatctat gtcgggtgcg gagaaagagg taatgaaatg gcatccggat 1080 
ctgcatcgca ggatgctgct ggctaccctg tggaacacct acatctgtat taacgaagca 1140 
ttattgaagc atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta 1200 
gaaaaataaa caaatagggg ttccgcgcac atttccccga aaagtgccac ctgacgtcta 1260 
agaaaccatt attatcatga cattaaccta taaaaatagg cgtatcacga ggccctttcg 1320 
tcttcaagaa ttctcatgtt tgacagctta tcatcgataa gctttaatgc ggtagtttat 1380 
cacagttaaa ttgctaacgc agtcaggcac cgtgtatgaa atctaacaat gcgctcatcg 1440 
tcatcctcgg caccgtcacc ctggatgctg taggcatagg cttggttatg ccggtactgc 1500 
cgggcctctt gcgggatatc gtccattccg acagcatcgc cagtcactat ggcgtgctgc 1560 
tagcgctata tgcgttgatg caatttctat gcgcacccgt tctcggagca ctgtccgacc 1620 
gctttggccg ccgcccagtc ctgctcgctt cgctacttgg agccactatc gactacgcga 1680 
tcatggcgac cacacccgtc ctgtggatcc tctacgccgg acgcatcgtg gccggcatca 1740 
ccggcgccac aggtgcggtt gctggcgcct atatcgccga catcaccgat ggggaagatc 1800 
gggctcgcca cttcgggctc atgagcgctt gtttcggcgt gggtatggtg gcaggccccg 1860 
tggccggggg actgttgggc gccatctcct tgcatgcacc attccttgcg gcggcggtgc 1920 
tcaacggcct caacctacta ctgggctgct tcctaatgca ggagtcgcat aagggagagc 1980 
gtcgaccgat gcccttgaga gccttcaacc cagtcagctc cttccggtgg gcgcggggca 2040 
tgactatcgt cgccgcactt atgactgtct tctttatcat gcaactcgta ggacaggtgc 2100 
cggcagcgct ctgggtcatt ttcggcgagg accgctttcg ctggagcgcg acgatgatcg 2160 
gcctgtcgct tgcggtattc ggaatcttgc acgccctcgc tcaagccttc gtcactggtc 2220 
ccgccaccaa acgtttcggc gagaagcagg ccattatcgc cggcatggcg gccgacgcgc 2280 
tgggctacgt cttgctggcg ttcgcgacgc gaggctggat ggccttcccc attatgattc 2340 
ttctcgcttc cggcggcatc gggatgcccg cgttgcaggc catgctgtcc aggcaggtag 2400 
atgacgacca tcagggacag cttcaaggat cgctcgcggc tcttaccagc ctaacttcga 2460 
tcattggacc gctgatcgtc acggcgattt atgccgcctc ggcgagcaca tggaacgggt 2520 
tggcatggat tgtaggcgcc gccctatacc ttgtctgcct ccccgcgttg cgtcgcggtg 2580 
catggagccg ggccacctcg acctgaatgg aagccggcgg cacctcgcta acggattcac 2640 
cactccaaga attggagcca atcaattctt gcggagaact gtgaatgcgc aaaccaaccc 2700 
ttggcagaac atatccatcg cgtccgccat ctccagcagc cgcacgcggc gcatctcggg 2760 
cagcgttggg tcctgggctg gcattgaccc tgagtgattt ttctctggtc ccgccgcatc 2820 
cataccgcca gttgtttacc ctcacaacgt tccagtaacc gggcatgttc atcatcagta 2880 
acccgtatcg tgagcatcct ctctcgtttc atcggtatca ttacccccat gaacagaaat 2940 
cccccttaca cggaggcatc agtgaccaaa caggaaaaaa ccgcccttaa catggcccgc 3000 
tttatcagaa gccagacatt aacgcttctg gagaaactca acgagctgga cgcggatgaa 3060 
caggcagagc tcttactgtc atgccatccg tatgagtgag tagaataaag tcttaaactg 3120 
aacaaaatag atctaaacta tgacaataaa gtcttaaact agacagaata gttgtaaact 3180 
gaaatcagtc cagttatgct gtgaaaaagc atactggact tttgttatgg ctaaagcaaa 3240 
ctcttcattt tctgaagtgc aaattgcccg tcgtattaaa gaggggcgtg gggtcgacgc 3300 
ggccgctaac tataacggtc ctaaggtagc gagtttaaac gatatcggat ccggccgccg 3360 
ctgaggtctg cctcgtgaag aaggtgttgc tgactcatac caggcctgaa tcgccccatc 3420 
atccagccag aaagtgaggg agccacggtt gatgagagct ttgttgtagg tggaccagtt 3480 
ggtgattttg aacttttgct ttgccacgga acggtctgcg ttgtcgggaa gatgcgtgat 3540 
ctgatccttc aactcagcaa gagttcgatt tattcaacaa agccgccgtc ccgtcaagtc 3600 
agcgtaatgc tctgccagtg ttacaaccaa ttaaccaatt ctgattagaa aaactcatcg 3660 
agcatcaaat gaaactgcaa tttattcata tcaggattat caataccata tttttgaaaa 3720 
agccgtttct gtaatgaagg agaaaactca ccgaggcagt tccataggat ggcaagatcc 3780 
tggtatcggt ctgcgattcc gactcgtcca acatcaatac aacctattaa tttcccctcg 3840 
tcaaaaataa ggttatcaag tgagaaatca ccatgagtga cgactgaatc cggtgagaat 3900 
ggcaaaagct tatgcatttc tttccagact tgttcaacag gccagccatt acgctcgtca 3960 
tcaaaatcac tcgcatcaac caaaccgtta ttcattcgtg attgcgcctg agcgagacga 4020 
aatacgcgat cgctgttaaa aggacaatta caaacaggaa tcgaatgcaa ccggcgcagg 4080 
aacactgcca gcgcatcaac aatattttca cctgaatcag gatattcttc taatacctgg 4140 
aatgctgttt tcccggggat cgcagtggtg agtaaccatg catcatcagg agtacggata 4200 
aaatgcttga tggtcggaag aggcataaat tccgtcagcc agtttagtct gaccatctca 4260 
tctgtaacat cattggcaac gctacctttg ccatgtttca gaaacaactc tggcgcatcg 4320 
ggcttcccat acaatcgata gattgtcgca cctgattgcc cgacattatc gcgagcccat 4380 
ttatacccat ataaatcagc atccatgttg gaatttaatc gcggcctcga gcaagacgtt 4440 
tcccgttgaa tatggctcat aacacccctt gtattactgt ttatgtaagc agacagtttt 4500 
attgttcatg atgatatatt tttatcttgt gcaatgtaac atcagagatt ttgagacaca 4560 
acgtggctta ctaggatccg atatcattta aatctaggga taacagggta atactagtgt 4620 
cgaccaacca gataagtgaa atctagttcc aaactatttt gtcattttta attttcgtat 4680 
tagcttacga cgctacaccc agttcccatc tattttgtca ctcttcccta aataatcctt 4740 
aaaaactcca tttccacccc tcccagttcc caactatttt ctactcactc accgtaagat 4800 
gcttttctgt gact 4814 



<210> 8 
<211> 35 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide primer 

<400> 8 

acggtgagtg agtagaaaat agttgggaac tggga 35 



<210> 9 
<211> 43 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide primer 

<400> 9 

cgtatgagtg agtagaataa agtcttaaac tgaacaaaat aga 43 



<210> 10 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide primer 

<400> 10 

aagtagcttt tctgtgactg gt 22 



WO 03/087370 PCT/FI03/00285 

<210> 11 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide primer 

<400> 11 

gatggcatga cagtaagagc t 21 



<210> 12 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide primer 

<400> 12 

agctggcgaa agggggatgt g 21 



<210> 13 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 
Oligonucleotide primer 



<400> 13 

ttatgcttcc ggctcgtatg ttgtgt 26 